Fast Parallel File Replication In Data Grid

Authors

  • Rauf Izmailov
  • Samrat Ganguly
  • Nan Tu
Abstract

Parallel file replication, where a large file needs to be replicated simultaneously to multiple sites, is an integral part of data-intensive grid environments. Current data transport mechanisms such as GridFTP are designed mainly for point-to-point file transfer, not for the parallel point-to-multipoint transfer that replication requires. This paper presents a tool that creates multiple distribution trees by pipelining point-to-point transfers (e.g., GridFTP sessions) and optimizes the time needed to replicate a file to multiple sites. Performance results from simulation and from deployment of the tool on the Internet show a speedup of up to 9 times for the proposed tool compared with using point-to-point GridFTP alone.
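
The benefit of pipelining can be illustrated with a rough back-of-the-envelope model. The sketch below is not the paper's algorithm: it assumes a single chain of sites rather than multiple distribution trees, identical bandwidth on every link, and no protocol overhead. The function names, parameters, and numbers are hypothetical and chosen only for illustration.

```python
import math


def sequential_time(file_size, bandwidth, num_sites):
    """Time for the source to send the whole file to each site in turn.

    Assumes one point-to-point transfer at a time, each at full bandwidth.
    """
    return num_sites * file_size / bandwidth


def pipelined_chain_time(file_size, bandwidth, num_sites, chunk_size):
    """Time for a chain source -> site 1 -> ... -> site N with chunk forwarding.

    Each site forwards a chunk to the next site as soon as it has received it.
    The first chunk needs num_sites hops to reach the last site; the remaining
    chunks then arrive one chunk time apart. Every chunk is counted as a full
    chunk, so this is an upper bound when the last chunk is shorter.
    """
    num_chunks = math.ceil(file_size / chunk_size)
    chunk_time = chunk_size / bandwidth
    return (num_chunks + num_sites - 1) * chunk_time


if __name__ == "__main__":
    # Hypothetical numbers: 1000 MB file, 10 MB/s links, 9 sites, 10 MB chunks.
    seq = sequential_time(1000, 10, 9)
    pipe = pipelined_chain_time(1000, 10, 9, 10)
    print(f"sequential: {seq:.0f} s, pipelined chain: {pipe:.0f} s")
```

Under these assumptions the sequential time grows linearly with the number of sites, while the pipelined time adds only one chunk time per extra site. Real deployments have heterogeneous links, which is where choosing among several distribution trees matters.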

Related papers

An Efficient Data Replication Strategy in Large-Scale Data Grid Environments Based on Availability and Popularity

Data grid technology, which uses the scale of the Internet to overcome storage limitations for huge amounts of data, has become one of the hot research topics. Recently, data replication strategies have been widely employed in distributed environments to copy frequently accessed data to suitable sites. The primary purposes are shortening the distance of file transmission and achieving files from ...

E2DR: Energy Efficient Data Replication in Data Grid

Abstract: Data grids are an important branch of grid computing that provides mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client applications in scientific and business domai...

Improving Data Grids Performance by Using Modified Dynamic Hierarchical Replication Strategy

Abstract: A Data Grid connects a collection of geographically distributed computational and storage resources and enables users to share data and other resources. Data replication, a technique much discussed by Data Grid researchers in recent years, creates multiple copies of a file and places them in various locations to shorten file access times. In this paper, a dynamic data replication strate...

Worldwide Fast File Replication on Grid Datafarm

The Grid Datafarm architecture is designed for global petascale data-intensive computing. It provides a global parallel filesystem with online petascale storage, scalable I/O bandwidth, and scalable parallel processing, and it can exploit local I/O in a grid of clusters with tens of thousands of nodes. One of its features is that it manages file replicas in filesystem metadata for fault tolerance a...

Striped replication for the grid environment as a web service

Optimizing access to data resources is one of the key features of large distributed systems. Grid computing is a technology with the potential to become a new paradigm for distributed processing in science and industry. Grids need reliable and fast access to distributed data. Data replication, creation of multiple copies of a single data file on distinct grid nodes...

Publication date: 2004